Close out review debts: perf fix, dedup, latent-trap cleanup, Ladybug migration doc close-out#333
Merged
Merged
Conversation
is_ignored returned (bool, IgnoreLayer) and computed _winning_row (one GitIgnoreSpec rebuild per ignore-rule prefix) on every call. All 7 production callers (iter_java_source_files + java_index_flow_lancedb) discarded the layer; only diagnose-ignore needs source attribution, and diagnose_dict already computes it. is_ignored now returns a plain bool; the two test assertions that checked the layer's source migrate to diagnose_dict. On a repo with ~100 ignore rules this cuts ~5000 spec compilations per indexed file down to one. Co-Authored-By: Claude <noreply@anthropic.com>
- ast_java: four byte-identical _codebase_*_inner_annotation_nodes walkers -> one _inner_annotation_nodes(node, src, target_simple). - graph_enrich: three identical _route/_client/_async _hint_lookup helpers -> one generic _hint_lookup (TypeVar) used at all three call sites. - ladybug_queries: find_callers/find_callees (~50-line near-twins) -> one _walk_calls helper; the two public methods keep their signatures and delegate, differing only in call-graph orientation. - pr_analysis: drop the dead 'notes' local in compute_risk (never appended; the real notes are assembled by analyze_pr_pipeline and merged there). meta()'s 5-level query cascade in ladybug_queries is intentionally left in place: it is reachable via direct LadybugGraph(...) construction that bypasses get()'s ontology gate, and test_feign_not_exoser relies on its fallback, so it is not cleanly dead right after the Ladybug migration. Co-Authored-By: Claude <noreply@anthropic.com>
…terals mcp_v2: 15 error branches passed hints=[] to NeighborsOutput/DescribeOutput, neither of which has a hints field (their fields are advisories / hints_structured, both defaulting to []). pydantic silently dropped the kwarg; today harmless, but the moment anyone adds extra='forbid' to those models every error branch would raise ValidationError swallowed by the catch-all. The models already default to empty, so the dead kwargs are simply removed. build_ast_graph: the four brownfield layer names were spelled out four times (_client_source_layer, _producer_source_layer, brownfield_strategies, and _BROWNFIELD_LAYERS). Promote _BROWNFIELD_LAYERS to the single source of truth and define brownfield_strategies as _BROWNFIELD_LAYERS plus the two caller-side declaration strategies (codebase_client/codebase_producer). The two sets still differ deliberately: _BROWNFIELD_LAYERS gates brownfield_only authoritativeness (per edge), while brownfield_strategies counts annotation-declared callers in the *_from_brownfield_pct stats — now that relationship is explicit instead of two independent literals that looked unrelated. Co-Authored-By: Claude <noreply@anthropic.com>
The KuzuDB->LadybugDB migration (commit #302) landed as code but its doc/string sweep and plan close-out were never finished, leaving operator and agent docs asserting the old store. This completes it: Docs (current-state Kuzu -> LadybugDB, code_graph.kuzu -> .lbug, --kuzu-path -> --ladybug-path, kuzu_queries.py -> ladybug_queries.py, KuzuGraph -> LadybugGraph, kuzu_path -> ladybug_path; ontology 15/16 -> 17): README, AGENTS, docs/CONFIGURATION, docs/JAVA-CODEBASE-RAG-CLI, docs/MANUAL-VERIFICATION-CHECKLIST, docs/CODEBASE_REQUIREMENTS, docs/AGENT-GUIDE, docs/PRODUCT-VISION, tests/README. Factual fixes surfaced by the markdown freshness audit: - README: DECLARES_ROUTE (nonexistent edge) -> EXPOSES; role list no longer lists PRODUCER (a node kind) and now includes COMPONENT/CONFIG/ENTITY; EMBEDDING_MODEL -> SBERT_MODEL (the real env var). - AGENT-GUIDE + SKILL: route frameworks corrected to spring_mvc/webflux (kafka/rabbitmq/jms/stream are route kinds; feign is a client kind). - PRODUCT-VISION: CALLS is shipped, not 'planned'. External citation titles (footnotes 12/17) intentionally left as 'Kuzu'. Shipped-artifact resync + plan close-out: - install_data/{skills,agents} explorer copies re-synced from source (they were behind, missing source_layer and the schema-rejection note). - Moved the landed PLAN/propose for LADYBUG-DB-MIGRATE and INDEX-OUTPUT-REWORK from active/ to completed/. Source docstring/help-string sweep only (cli/pr_analysis/mcp_v2/search_lancedb, conftest, test_ladybug_queries docstrings) — no behaviour change; the one clearly-stale kuzu 0.11.x version reference in mcp_v2 is genericized. Co-Authored-By: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Applies the four prioritized debts from the six-domain codebase review (no big refactor — the architecture was assessed sound; these are the bounded, high-value cleanups).
Debt 2 —
is_ignoredO(n²) on the index hot pathLayeredIgnore.is_ignoredreturned(bool, IgnoreLayer)and ran_winning_row(oneGitIgnoreSpecrebuild per ignore-rule prefix) on every call. All 7 production callers discarded the layer; onlydiagnose-ignoreneeds source attribution, anddiagnose_dictalready computes it.is_ignorednow returns a plainbool; the two tests that checked the layer migrate todiagnose_dict. On a repo with ~100 ignore rules this cuts ~5000 spec compilations per indexed file down to one.Debt 4 — mechanical duplication (4a–4d)
ast_java: four byte-identical_codebase_*_inner_annotation_nodeswalkers → one_inner_annotation_nodes(node, src, target_simple).graph_enrich: three identical_route/_client/_async _hint_lookuphelpers → one generic_hint_lookup(TypeVar).ladybug_queries:find_callers/find_callees(~50-line near-twins) → one_walk_calls; public signatures preserved.pr_analysis: drop the deadnoteslocal incompute_risk(never appended; real notes come fromanalyze_pr_pipeline).Deferred (4e): the
meta()5-level query cascade inladybug_queriesis not cleanly dead — it is reachable via directLadybugGraph(...)construction that bypassesget()'s ontology gate, andtest_feign_not_exoserrelies on its fallback. Left in place; revisit after the migration fully settles.Debt 3 — latent footguns
mcp_v2: 15 error branches passedhints=[]toNeighborsOutput/DescribeOutput, which have nohintsfield (real fieldsadvisories/hints_structureddefault to[]). pydantic silently dropped the kwarg — harmless today, but aValidationErrorlandmine the moment anyone addsextra="forbid". Removed (the models already default to empty).build_ast_graph: the four brownfield layer names were spelled out 4×. Promoted_BROWNFIELD_LAYERSto the single source of truth;brownfield_strategiesis now_BROWNFIELD_LAYERS | {codebase_client, codebase_producer}. Behavior-preserving — the two sets still differ deliberately (layer gate vs. stat counter); that relationship is now explicit instead of two unrelated literals.Debt 1 — close out the LadybugDB migration
The KuzuDB→LadybugDB migration (#302) landed as code but its doc/string sweep and plan close-out were never finished. Completed here:
code_graph.kuzu→.lbug,--kuzu-path→--ladybug-path,kuzu_queries.py→ladybug_queries.py,KuzuGraph→LadybugGraph,kuzu_path→ladybug_path; ontology15/16→17(code was already 17). Across README, AGENTS, CONFIGURATION, CLI guide, verification checklist, CODEBASE_REQUIREMENTS, AGENT-GUIDE, PRODUCT-VISION, tests/README.DECLARES_ROUTE(nonexistent edge)→EXPOSES; role list no longer listsPRODUCER(a node kind) and now includesCOMPONENT/CONFIG/ENTITY;EMBEDDING_MODEL→SBERT_MODEL(the real env var). AGENT-GUIDE + SKILL route frameworks corrected tospring_mvc/webflux(kafka/rabbitmq/jms/stream are route kinds; feign is a client kind). PRODUCT-VISIONCALLSis shipped, not planned. External citation titles left asKuzu.install_data/{skills,agents}explorer copies re-synced from source (were behind — missingsource_layer, schema-rejection note). Moved the landedPLAN/ propose for LADYBUG-DB-MIGRATE and INDEX-OUTPUT-REWORK fromactive/tocompleted/.kuzu 0.11.xversion reference inmcp_v2is genericized.Verification
.venv/bin/ruff check .— clean..venv/bin/python -m pytest tests -q— 838 passed, 13 skipped (skips are theJAVA_CODEBASE_RAG_RUN_HEAVY-gated tests).Notes
SBERT_MODELwas always the real variable; only the doc that saidEMBEDDING_MODELwas fixed.is_ignoredreturn type changed fromtuple[bool, IgnoreLayer|None]tobool(internal API; all callers updated). Breaking change is allowed per repo policy.🤖 Generated with Claude Code